Sharp failure rates for the bootstrap
particle filter in high dimensions

Peter Bickel, Bo Li and Thomas Bengtsson

University of California-Berkeley, Tsinghua University and Bell Labs 

Abstract: We prove that the maximum of the sample importance weights 
in a high-dimensional Gaussian particle filter converges to unity unless the 
ensemble size grows exponentially in the system dimension. Our work is mo- 
tivated by and parallels the derivations of Bengtsson, Bickel and Li (2007); 
however, we weaken their assumptions on the eigenvalues of the covariance 
matrix of the prior distribution and establish rigorously their strong conjec- 
ture on when weight collapse occurs. Specifically, we remove the assumption 
that the nonzero eigenvalues are bounded away from zero, which, although 
the dimension of the involved vectors grows to infinity, essentially permits the
effective system dimension to be bounded. Moreover, with some restrictions on
the rate of growth of the maximum eigenvalue, we relax their assumption that
the eigenvalues are bounded from above, allowing the system to be dominated 
by a single mode. 
1. Introduction

Bayesian filtering methods are a commonly employed tool for combining physical
models and data. The filters treat the unknown system state as a random variable 
and resolve its probability density conditional on the data (and the system dynam- 
ics) through Monte Carlo sampling techniques. When applied sequentially in time, 
these methods are commonly referred to as particle filters ([8], [10]). For a diverse 
collection of applications and an excellent introduction to the field in general, see 
the edited volume by Doucet [()]. The particle filter method relies heavily on a
likelihood-based reweighting mechanism of the involved sample draws. This reweighting
scheme produces the so called importance weights, and these weights are the pri- 
mary focus of our work. Specifically, in a Gaussian filter context, we examine the 
behavior of the importance weights as a function of the system dimension and of 
sample size. 

The popularity of the particle filter is no doubt due to the flexibility of the
model framework to handle both non-linear and non-Gaussian structures. However,
in spite of its generality, the method is not without flaws: the particle filter is 
known to require large Monte Carlo ensembles and frequent resampling to estimate 
the desired densities (cf., [9]). This drawback is particularly prevalent in higher 
dimensional systems where the filter becomes unstable and quickly collapses onto 
a single point mass. In recent work, for a single Bayes update step in a Gaussian 
setting, Bengtsson, Bickel, and Li [3] give a derivation of the weight collapse as
a function of the system dimension and of sample size. To shed further light on
the weight collapse, this paper establishes conjectures (given in [3]) which make
their arguments fully rigorous. Just as significantly, we show that collapse is a
function of the effective dimension (defined in Section 3), rather than the absolute 
dimension. As in [3], our analysis is given in the context of a stylized Gaussian 
example, but we conjecture (and simulations show) that our results are informative 
for situations that depend on similarly defined reweighting schemes. The results 
imply that to avoid collapse, the sample size must grow super-exponentially in the 
effective dimension. We do not investigate refinements of particle filter methods,
such as simulated tempering [4], although our discussion in Section 2.1 suggests
that this approach does not avoid collapse in truly high-dimensional
settings.

Our work is outlined as follows. The next section describes the particle filter, 
provides notation, and describes the use of the ensemble method for approximating 
posterior densities. The main developments are then presented in Section 3, where 
we give several results establishing the conditions under which the maximum sample 
weight in a Gaussian particle filter converges to unity. All technical results are 
proved in the Appendix. (We note that some material in Section 2.1 and Section 3 
is given in [3], but is reproduced here for completeness.) 

2. Model setting 

2.1. The particle filter

Let $X_t$ represent the unknown system state at time $t$, let $Y_t$ be a noisy data
measurement of $X_t$, and let $Y^t$ represent all data up to and including time $t$. Based
on the data $Y^t$ and (some) knowledge of the time-evolution of the system state
from $X_{t-1}$ to $X_t$, we seek the posterior distribution $p(X_t|Y^t)$. We assume we have
available a random sample $\{X_t^{f,i}\}$ of size $n$ from the prior distribution $p(X_t|Y^{t-1})$.
Associated with the prior sample is a set of weights $\{w_t^i\}$. We assume further that
the likelihood density $p(Y_t|X_t)$ is computable for arbitrary $X_t$.

The particle filter seeks to estimate, recursively in time, the probability
distribution of the unknown state $X_t$. At each time $t$, the probability distribution is
represented by the sample ensemble $\{x_t^{f,i}, w_t^i\}$, and the distribution can be
propagated forward one time-step by evolving each $x_t^{f,i}$ using the system dynamics. Once
new data $Y_t$ is available, Bayes theorem is used to adjust the weights based on how
"close" the associated sample points are to the data. The following schematic
describes the particle filter:



Here, at time $t$ (on the left), Bayes theorem combines $p(X_t|Y^{t-1})$ and $Y_t$ to produce
$p(X_t|Y^t)$. The system dynamics, represented above by $G(\cdot)$ (middle), is used
to propagate $p(X_t|Y^t)$ one time step, and this yields $p(X_{t+1}|Y^t)$. Bayes theorem is
then again employed to find the posterior $p(X_{t+1}|Y^{t+1})$ (right).

In a particle filter, the above schematic is straightforwardly implemented (at least 
conceptually) using a random sample. We note first that the change-of-variables 
problem represented by the propagation of $p(X_t|Y^t)$ can be solved by evaluating
$G(\cdot)$ at each sample point. We will not discuss the implementation of the forecast
step here; instead, our focus is on the Bayes update step. As mentioned, the particle 
filter implements the Bayes step by reweighting the prior sample according to the 
likelihood. We note in passing that the particle filter may be derived as a (sequen- 
tial) importance sampler (e.g., [2]) where the proposal distribution is given by the 
prior and the target distribution is given by the posterior. In the schematic
below, which describes a bootstrap-likelihood filter, the prior sample is "converted"
to a posterior sample by resampling (with replacement) each member $x_t^{f,i}$ with
probability proportional to $w_t^i \times p(Y_t|x_t^{f,i})$, i.e.,



Although the particle filter has been successfully applied to a variety of settings, it
often produces highly varying importance weights. Remedies to stabilize the filter 
include resampling (renormalizing) the involved empirical measure at regular time 
intervals [8, 9] and marginalizing or restricting the sample space by conditioning on 
a larger information set [10, 11]. Another approach is given by simulated tempering
[4], which makes use of the regularized likelihood $p(Y_t|x_t^{f,i})^{\alpha}$, where $0 < \alpha < 1$.
However, as can be seen from our derivations, e.g. Proposition 3.1, a fixed $\alpha$ does not
alter the conclusion of collapse. Moreover, for each time point, to obtain samples
from the target density, simulated tempering generates a sequence of ensembles
from kernels $K_i(\cdot)$ ($i = 1, \ldots, I$) such that $K_I(\cdot)$ approaches the desired kernel
$K(\cdot)$ associated with the posterior density. Unfortunately, for truly high-dimensional
systems, we conjecture that the number of intermediate sampling steps $I$ would be
prohibitively large and render the method practically infeasible. Thus, such remedies do not
fundamentally address performance when the filter is applied to very large scale
systems. For example, as noted by ([1], [1-^)]), when applied in high dimensions,
the filter collapses to a point mass after a few (or even one!) observation cycles.
In particular, as will be shown in Section 3, it is the normalized quantity
$w_i = p(Y_t|x_t^{f,i}) / \sum_j p(Y_t|x_t^{f,j})$ that behaves singularly.
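
The Bayes update described above (reweight by the likelihood, normalize, then resample with replacement) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `bootstrap_update` and `gaussian_loglik`, the scalar observation model, and the sample size are all assumptions made for the example.

```python
import math
import random


def bootstrap_update(particles, weights, y, log_likelihood, rng):
    """One Bayes update: reweight by the likelihood, normalize, resample.

    Assumes every incoming weight is strictly positive.
    """
    # Unnormalized log-weights: log w_i + log p(y | x_i).
    log_w = [math.log(w) + log_likelihood(y, x)
             for w, x in zip(weights, particles)]
    # Subtract the maximum before exponentiating to avoid underflow.
    m = max(log_w)
    unnorm = [math.exp(lw - m) for lw in log_w]
    total = sum(unnorm)
    post_w = [u / total for u in unnorm]
    # Resample with replacement, with probability proportional to post_w.
    resampled = rng.choices(particles, weights=post_w, k=len(particles))
    return resampled, post_w


def gaussian_loglik(y, x):
    # Hypothetical scalar model y = x + noise, noise ~ N(0, 1), up to a constant.
    return -0.5 * (y - x) ** 2


rng = random.Random(0)
prior = [rng.gauss(0.0, 1.0) for _ in range(1000)]
uniform = [1.0 / len(prior)] * len(prior)
posterior, w = bootstrap_update(prior, uniform, 1.0, gaussian_loglik, rng)
print(max(w), sum(posterior) / len(posterior))
```

In this one-dimensional toy case the largest normalized weight stays small; the sections that follow concern how this changes as the dimension grows.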

The next section sets up the necessary notation and formalizes our problem. 

2.2. Monte Carlo scheme 

We formalize our problem as follows. Consider a set of $n$ sample points $X =
\{X_1, \ldots, X_n\}$, where $X_i \in \mathbb{R}^d$ and both the sample size $n$ and system dimension $d$
are "large." (To lighten notation, we have dropped the time subscript and the
forecast superscript.) We assume that the sample $X$ is drawn randomly from the prior
(or proposal) distribution $p(X)$. New data $Y$ is related to the state $X$ by the
conditional density $p(Y|X)$. For concreteness, a functional relationship $Y = f(X) + \epsilon$ is
assumed, and $\epsilon$ is taken to be independent of the state $X$. The goal is to estimate
posterior expectations using the importance ratio, i.e., for some function $h(\cdot)$, we
want to estimate

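$$E(h(X)|Y) = \int h(X)\,\frac{p(Y|X)\,p(X)}{\int p(Y|X)\,p(X)\,dX}\,dX,$$

and use

$$\hat{E}(h(X)|Y) = \sum_{i=1}^{n} h(X_i)\,\frac{p(Y|X_i)}{\sum_{j=1}^{n} p(Y|X_j)}$$

as an estimator. Based on this formulation, the weights attached to each ensemble
member

(1) $w_i = \frac{p(Y|X_i)}{\sum_{j=1}^{n} p(Y|X_j)}$
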

are the primary objects of our study. As mentioned, in large scale analyses, the
weights in (1) are highly variable and often produce estimates $\hat{E}(\cdot)$ which are
collapsed onto a point mass with $\max(w_i) \approx 1$. As illuminated in [3], this degeneracy
is pervasive for high-dimensional systems, and appears to hold for a variety of prior 
and likelihood distributions. 

We next consider the case when both the prior and the likelihood distributions 
are Gaussian. 



3. Gaussian case 



We assume a data model given by $Y = HX + \epsilon$, where $Y$ is a $d \times 1$ vector, $H$ is a
known $d \times q$ matrix, and $X$ is a $q \times 1$ vector. Both the proposal distribution and
the error distribution are Gaussian with $p(X) = N(\mu_X, \Sigma_X)$ and $p(\epsilon) = N(0, \Sigma_\epsilon)$,
and the noise $\epsilon$ is taken independent of the state $X$. Since the data model can be
pre-rotated by $\Sigma_\epsilon^{-1/2}$, we set $\Sigma_\epsilon = I_d$ without loss of generality (wlog). Moreover,
since $EY = EHX$, we can replace $X_i$ by $(X_i - EX_i)$ and $Y$ by $(Y - EY)$ and
leave $p(Y|X)$ unchanged. Hence, wlog we also set $\mu_X = 0$. Further, define, for
conformable $A$ and $B$, the inner product $\langle A, B \rangle = A^T B$ (where the superscript $T$
denotes matrix transpose), and let $\|A\|^2 = \langle A, A \rangle$.

With $p(Y|X) \sim N(HX, I_d)$, the weights in (1) can be expressed as

(2) $w_i = \frac{\exp(-\|Y - HX_i\|^2/2)}{\sum_{j=1}^{n} \exp(-\|Y - HX_j\|^2/2)}.$
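
As a rough numerical illustration of the Gaussian weights above, the sketch below evaluates the largest normalized weight on the log scale for the simplest special case. The choices $H = I$, $X_i \sim N(0, I_d)$, unit noise covariance, the sample size, and the function names `max_weight` and `mean_max_weight` are assumptions made for the example, not settings taken from the paper.

```python
import math
import random


def max_weight(n, d, rng):
    """Largest normalized Gaussian weight for H = I, X_i ~ N(0, I_d)."""
    truth = [rng.gauss(0.0, 1.0) for _ in range(d)]
    y = [x + rng.gauss(0.0, 1.0) for x in truth]   # Y = X + noise
    log_w = []
    for _ in range(n):
        sq = 0.0
        for yj in y:
            diff = yj - rng.gauss(0.0, 1.0)        # Y_j - X_ij
            sq += diff * diff
        log_w.append(-0.5 * sq)                    # log of exp(-||Y - X_i||^2 / 2)
    m = max(log_w)
    total = sum(math.exp(lw - m) for lw in log_w)  # log-sum-exp shift by the max
    return 1.0 / total                             # the max term contributes exp(0)


def mean_max_weight(n, d, reps, seed):
    """Average of the maximum weight over independent replications."""
    rng = random.Random(seed)
    return sum(max_weight(n, d, rng) for _ in range(reps)) / reps


low = mean_max_weight(n=200, d=1, reps=5, seed=1)
high = mean_max_weight(n=200, d=400, reps=5, seed=1)
print("d = 1:", low, " d = 400:", high)
```

With the sample size held fixed, the average maximum weight should be far larger at $d = 400$ than at $d = 1$, which is the qualitative behavior the following results make precise.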

To establish weight collapse for high-dimensional Gaussian $p(Y|X)$ and $p(X)$, we
first write the exponent in (2) in terms of the singular values of $\mathrm{cov}(HX)$.

Let $d' = \mathrm{rank}(H)$. With $\lambda_1^2, \ldots, \lambda_{d'}^2$ the singular values of $\mathrm{cov}(HX)$, define the
$d' \times d'$ matrix $D = \mathrm{diag}(\lambda_1, \ldots, \lambda_{d'})$. Then, with $Q$ an orthogonal matrix obtained
by the singular value decomposition of $\mathrm{cov}(HX)$, define the $d' \times 1$ vector $V$ such
that

$Q^T HX = DV.$

Note that $V_i$ corresponding to $X_i$ is $N(0, I_{d'})$. Since $Q$ is orthogonal, we can write

(3) $\|Y - HX_i\|^2 = \|Q^T Y - DV_i\|^2 = \sum_{j=1}^{d'} \lambda_j^2 W_{ij}^2 + \sum_{j=d'+1}^{d} \epsilon_{0j}^2,$

where, conditional on $Y$, $[W_{i1}, \ldots, W_{id'}]^T$ is $N(\mu, I_{d'})$, and where $\epsilon_{0j}$ is the $j$th
component of the observation noise vector $\epsilon$. The mean vector $\mu = [\mu_1, \ldots, \mu_{d'}]^T$ is
given by

(4) $D^{-1} Q^T Y = V + D^{-1} \epsilon',$

where $V$ and $\epsilon'$ are independent $N(0, I_{d'})$.
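Now, for $i = 1, \ldots, n$, define

(5) $S_i = \frac{1}{2\sigma_{d'}} \sum_{j=1}^{d'} \lambda_j^2 \left( W_{ij}^2 - (1 + \mu_j^2) \right).$

Note that the second term in (3) is constant for every $X_i$, and will not influence
the weight $w_i$.

By (2), we can express the maximum weight as

(6) $w_{(n)} = \frac{1}{1 + T_{n,d'}},$

where $T_{n,d'} = \sum_{i=2}^{n} e^{-\sigma_{d'}(S_{(i)} - S_{(1)})}$ with $\sigma_{d'}^2 = \frac{1}{2}\sum_{j=1}^{d'} \lambda_j^4 (1 + 2\mu_j^2)$. Thus, to
prove weight collapse, we need to show convergence of the denominator in (6) to
unity. We now state the following.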

Proposition 3.1. Let 5*^, z = 1, . . . , n, be independent random variables with cumu- 
lative distribution function (cdf) Gd(-) satisfying the conditions specified in Lemma 
A.l and Lemma A. 2 stated in the Appendix. Let S^i^ < • • • < S^n^ be the ordered 
sequence of Si, ... , Sn, and define, for some a > 0, Tn — X]"=2 e~'^^(^('!)~'^(i)\ 
Then, as, n,d 00, if '°g "^'°g _> g, we have 



y 2 log n 

A proof of the result is provided in the Appendix. For the Gaussian case consid- 
ered here, an immediate implication of Proposition 3.1 is weight collapse. Specifi- 
cally, with two additional assumptions, we may assert the following. 

Proposition 3.2. We assume, for the Gaussian case considered here, 
Al: There is a positive constant 5 such that j- > Ai, • • • , A^' > 5; and 

A2:r2 =f Eti(3Al + 2A|)-a2>0. 
Then, if ^ 0, we have W(„) ^ 1. 

Proposition 3.2 follows by Lemma A. 3 (Appendix) and Proposition 3.1. 

The above result implies that, unless n grows super-exponentially in d' , we have 
weight collapse. We note that Proposition 3.2 is a sharpening of the convergence 
rate as compared to that implied by Section 3.1 of [3]. The $\log d'$ term appears only
because $\max_j |\mu_j| = O_p(\sqrt{\log d'})$, and we need to make our analysis conditional on
the $\{\mu_j\}$.
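
The sum of exponentiated gaps that drives the maximum weight can be explored by simulation. The proof in the Appendix works with the expectation of this sum rescaled by the rate constant over $\sqrt{2 \log n}$; the sketch below prints the simulated mean next to $\sqrt{2 \log n}$ divided by the rate for comparison. It is illustrative only: the $S_i$ are taken to be exactly standard normal (an assumption, whereas the paper only requires the normal approximations of the Appendix), and `t_statistic`, `rate`, and the sample sizes are hypothetical names and values.

```python
import math
import random


def t_statistic(s, rate):
    """Sum over i >= 2 of exp(-rate * (S_(i) - S_(1))) for a sample s."""
    ordered = sorted(s)
    s_min = ordered[0]
    return sum(math.exp(-rate * (v - s_min)) for v in ordered[1:])


rng = random.Random(0)
n = 1000
for rate in (5.0, 20.0, 80.0):
    draws = [t_statistic([rng.gauss(0.0, 1.0) for _ in range(n)], rate)
             for _ in range(100)]
    mean_t = sum(draws) / len(draws)
    # Second column: Monte Carlo mean; third: sqrt(2 log n) / rate.
    print(rate, mean_t, math.sqrt(2.0 * math.log(n)) / rate)
```

For a fixed sample the statistic is decreasing in the rate, so a larger rate pushes the maximum weight $1/(1 + T)$ toward one; agreement with the asymptotic scale should only be expected to be approximate at moderate $n$.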
The results in Proposition 3.2 suggest that large $d'$ leads to collapse. However,
we argue now that what really matters is the effective dimension of $X$, defined as
the sum of the singular values of $\mathrm{cov}(HX)$. We shall assume that

B: $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{d'} \ge \cdots$ are part of an infinite sequence.

Our arguments can be modified to the case where $\{\lambda_j : 1 \le j \le d'\}$ is a double
array, but we eschew this complication.
There are two possibilities,

(i) $\sum_{j=1}^{\infty} \lambda_j^2 < \infty$ or (ii) $\sum_{j=1}^{\infty} \lambda_j^2 = \infty$.

We claim that if (i) holds, there is no weight collapse. That is, if, say, $g : \mathcal{R} \mapsto \mathcal{R}$ is
bounded and continuous,

n 

(7) J2'',g{X:)^Eg{X\Y). 

i=l 

In the above, $X^*$ is drawn from the empirical measure $\sum_i w_i \delta(X_i)$, where $\delta(\cdot)$
represents the delta function, and where, as before, the $w_i$ represent the
likelihood-defined weights.

To verify the convergence in (7), note that 

n 



where 

d' 

2 



1 

(8) = c-^} exp{- - ^ [X] [Zl - 1) + 2A>, Z,,] }. 



In (8), the $Z_{ij}$'s are i.i.d. $N(0, 1)$ and

d' 

2 



1 

Cd' = E{ exp{- - ^ [A^^ (4 - 1) + 2A>,Z,,] }] 



d' 



n [(l + A^2)-l/2e>,/2g^a+^^)]. 



Now, since (i) implies that Hj^ill + Ap^^^^e'^'j^^ converges and 

- X]ii] _ ^ X]E{ii]) °° 



i=i ^ ' "J i=i * ' "J j=i 



we have $E(U_i) = 1$ and $c_{d'} \to c$ (with $c$ a constant).
Arguing as in Proposition 4.1 in [■>], we can show that 



1 " 1 



n 

i=l 
since a straightforward computation shows that $E(U_i^2) \le M < \infty$ for all $d'$. Thus,
under (i), the importance weights have the correct expectation and vanishing
variance.

On the other hand, if (ii) holds, we can state the following proposition. 
Proposition 3.3. Under Ji, ifJ^'jLi '^j ~ °° '^^'^ i^og n log d')/T'^, 0, we have 

V 2 log n 

We note that our conditions imply that 

maxi<j-<rf/A|(l + j/^^) 

so that asymptotic normality holds. The proof requires Lemmas A.l and A. 3. 

The form reveals that it is possible to have much slower collapse than what
Proposition 3.2 suggests. For instance, if $\lambda_j^2 = 1/j$, B holds but $\tau^2 = \log d' (1 + o(1))$.
In fact, the requirement that the $\lambda_j$ form an infinite sequence as above can be
weakened to requiring simply that the $\lambda_j$ be bounded above uniformly, and this
can be verified using a subsequence argument.

In conclusion, on the basis of Proposition 3.3, provided that the nonzero $\lambda_j$'s are
commensurate, it seems reasonable to define $\sum_{j=1}^{d'} \lambda_j^2$ as the effective dimension.

We note that the form of the effective dimension also plays a crucial role in the work 
of [7], who study Monte Carlo sample size requirements in the ensemble Kalman 
filter framework. 
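
To make the distinction between absolute and effective dimension concrete, the sketch below computes the sum of the $\lambda_j^2$ as the trace of $H \Sigma_X H^T$ (with the noise covariance already standardized to the identity) and contrasts a summable spectrum with a flat one. The helper names `effective_dimension` and `partial_sums`, and the two example spectra, are assumptions chosen for illustration.

```python
def effective_dimension(h, sigma_x):
    """trace(H Sigma_X H^T): the sum of the eigenvalues of cov(HX)."""
    q = len(sigma_x)
    total = 0.0
    for row in h:                      # one row of H per observed coordinate
        for k in range(q):
            for l in range(q):
                total += row[k] * sigma_x[k][l] * row[l]
    return total


def partial_sums(lambda_sq, dims):
    """Sum of lambda_j^2 for j = 1..d', for each d' in dims."""
    return [sum(lambda_sq(j) for j in range(1, d + 1)) for d in dims]


identity3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(effective_dimension(identity3, identity3))

dims = [10, 100, 1000]
summable = partial_sums(lambda j: j ** -2.0, dims)  # case (i): stays bounded
flat = partial_sums(lambda j: 1.0, dims)            # case (ii): grows like d'
print(summable, flat)
```

In the summable case the effective dimension stays below $\pi^2/6$ however large $d'$ becomes, while in the flat case it equals $d'$.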



Appendix 

We first introduce two lemmas that pertain to Edgeworth expansion type uniform 
normal approximations of the distribution (the cdf and the density respectively) 
of independent sums of random variables. The two lemmas lay the groundwork for 
the proof of Proposition 3.1. Valid for moderately large deviations, the first result 
(Lemma A.l) is a special case of Theorem 2.5 in [12], and is stated here without 
proof. 

Lemma A.l. Let be independent random variables with E^^j = and 

cr2 ^ Var{C]) <oo. Set 

Sd^ ^i^i + ---+id), 

Dd 

where = J2j=i '^jj '^'^d define the Lyapunov quotients 

^M = 4E^^|C.f, fc=l,2,.... 

^d 

We also suppose \E(Zj )\ < k\^j~'^a'j,k > 3, where 71, . . . o-f^ constant terms. 

With these conditions, as d — > 00, there exist analytic functions Pd{x) — 
X^fcLs '^fc.d^'^ wit/i \Xk.d\ < Ac'^d ~ for some A,c and all d, such that the cdf 
of Sd, denoted G d{-) , satisfies, 



$1 - G_d(x) = (1 - \Phi(x))\exp(P_d(x))(1 + o(1)),$
$G_d(-x) = \Phi(-x)\exp(P_d(-x))(1 + o(1))$

uniformly for all $x \ge 0$ and $x = o(B_d/K_d)$, where $K_d = \max_{1 \le j \le d}\{\gamma_j, \sigma_j\}$.
Furthermore, $P_d$ satisfies

(9) $|P_d(x)| \le c x^3/B_d$

for some constant c > 0. We use c generically as a constant independent of d. 

Lemma A.l gives a normal approximation for the cdf of independent sums, and 
serves as the basis for the normality conditions of Proposition 3.1. Next we give a
lemma for a normal approximation of the density of independent sums, which can 
be directly derived from Proposition 2 and Theorem 3 of [5] . 

Lemma A. 2. With the same notation and conditions as in Lemma A.l, we assume 
S^j^d has density gj^d such that sup^{\gj^dix)\ : I < j < d} < M < oo. Then, as 
$d \to \infty$, the density of $S_d$, $g_d(\cdot) = G_d'(\cdot)$, satisfies

$g_d(x) = \phi(x)\exp(P_d(x))(1 + o(1)),$

$g_d(-x) = \phi(-x)\exp(P_d(-x))(1 + o(1))$

uniformly for all $x \ge 0$ and $x = o(B_d/K_d)$, where $K_d = \max_{1 \le j \le d}\{\gamma_j, \sigma_j\}$.

We note in passing that the condition of uniform boundedness of the $g_{j,d}$ does not
hold for $Z_j^2$, the Gaussian-Gaussian case. However, the sum $\lambda_1^2 Z_1^2 + \lambda_2^2 Z_2^2$, where
$\lambda_1, \lambda_2 > 0$ and $Z_1, Z_2$ are independent Gaussian, does indeed satisfy the condition.
This may be verified by a direct calculation of the density of the convolution. 

The next lemma is given for the purpose of verifying the Lyapunov quotients 
conditions appearing in Lemmas A.l and A. 2. 

Lemma A.3. Let $Z_j, V_j, \epsilon_j$, $j = 1, \ldots, d$, be iid $N(0,1)$. Let $\lambda_1 \ge \lambda_2 \ge \cdots$ where
$\sum_{j=1}^{\infty} \lambda_j^2 = \infty$. Then, given $\mu_j = V_j + \epsilon_j/\lambda_j$ for all $j$, we have


(10) 



for k > 3. 



XfE{\{Z,+f,,r-{l + ^^j)\'^\n,) 



< 



kl 



p^A4i?((Z,+M,)'-(l + M')|Mj)' 



Thus, given the mean vector $\mu = [\mu_1, \mu_2, \ldots, \mu_d]$ defined in (4), Lemma A.3
states that the Lyapunov conditions required by Lemma A.1 hold, with probability
tending to 1. We note that our argument also implies Lemma A.l. 

Proof of Lemma A.3. Since $(Z_j + \mu_j)^2 - (1 + \mu_j^2) = (Z_j^2 - 1) + 2\mu_j Z_j$, it is enough
to bound

XfE{\{Z] - 1) + 2M,Z,f < 2\X^^E\Z] - 1|^- + 9j{\fi,\X'^)'^E\Z,\'^). 
By standard properties of the Gaussian moments, for some positive constant C, 
E\Z'f - ll*^ < C'^kl, and E\Zj\'' < C^k\ . 
Since E{Z'j - 1 + 2fLjZjY = 2 + Afi'j we see that (10) follows from the bound 

Af iMjf < A>2niax{|A,V£|'"' : 1 < ^ < 4 = (Op( ^log^))'"': 
since the Xjfij are independent A^(0, Xj + A^) so that 

max{|Af I : 1 < ^ < d} < (A^^ + X^)^/^ max{|T4| : 1 < £ < d} 
where the Vf are i.i.d. A^(0, 1). The lemma follows. □ 
The remainder of the Appendix is devoted to the proof of the main result given
in Proposition 3.1. 

Proof of Proposition 3.1. Let $S_j$ ($j = 1, \ldots, n$) be as defined in the Proposition and
let $S_{(1)}$ be the minimum. Note that

(11) E{T,,ASii)) = '-^ ^7^^ — , 

since, given <S'(i), the remaining [n — 1) observations are i.i.d. with cdf equal to 
Gd{z)/Gd{Si^i)), z> S(i). 

Let $\epsilon_d$ be a sequence of constants such that $\epsilon_d \to 0$ and $\epsilon_d \tau_d/\sqrt{\log n} \to \infty$ as
$n, d \to \infty$. We first define, for $|x| \le \epsilon_d \tau_d$,

(12) $h_{n,d}(x) := \int_x^{\infty} \exp(-\tau_d(z - x))\,dG_d(z).$

To evaluate $h_{n,d}(x)$, we break the integral into two parts: the first part yields
the integral from $x$ to $x + \epsilon_d \tau_d$, and the second part yields the tail integral from
$x + \epsilon_d \tau_d$ to $\infty$. By using the normal approximations of Lemmas A.1 and A.2,
under the assumption that $(\log n)/\tau_d^2 \to 0$, one can show that the second part is
$o(\sqrt{2 \log n}/(n \tau_d))$.

To deal with the first part, we shall show that a,a x —^ — oo and x > —EdTd, 

(13) / ^ " \xp{-Td{z-x))dGd{z) = -0(x)cxp(Prf(a;))(l + o(l)) 

Jx Td 

To this end, applying Lemma A. 2 with ^ = 3, we obtain, 

Rd{x) := J ' \xp[-Td{z-x)-^{z^-x^) + Pd{z)-Pd{x)]dz{l + o{l)) 

CXp [~TdV~ -{{X + vf - X^) 

OO 

+ ^kAi^ + - ^')]dz;(l + o(l)) 

fc=3 



— I exp[-(-l + — )u; 

\x\ Jo \x\ 2\x 



2 



(14) 



where 



+ J2 ^^'d E(-l)'"'^'=.^- \^\'~^'^] d^(l + 0(1)) 

k=3 j=l 
r-\x\edTi oo 

/ exp [ — biw — b2W + bj'w-^]dw{l + o(l)). 



bi = ^-fe^ = -l + f^-f:(-l)'=-lcMAMN^-^ 



k=3 
and

oo 

k=j 

Note \Xk,d\ < ^CgT^ '■'^ and Ck.j < c'', for some eonstants A,cq,c where — 
Vj + Hereafter, we use c as a generic positive constant that does not depend 

on a; and d. Under the assumptions that x — > — oo, |a:| < SdTd (hence |a;|/T(; —^ 0), 
and |a;|A„_d oo, we have, firstly, 

oo 
k=3 

= Acl [3(co|x|/rrf)/(l - {co\x\/Td)) + {co\x\ / r^f / {1 - {co\x\/Td)f] 

(15) = 0(1), 

secondly, 

oo 

(16) b; < X-' MAI^if-^ = \x\-\c\x\lTd)l{\ - c\x\lTd) = o{\x\-'), 
and thirdly, 

2j-l oo 



k=j k=2j 
k=3 



= J2 A(cco)''^l-l/-'^''"^'l-l"^-~'^"^' 



+ 5] Aicco)H\A/rd)'-''T',-'' 

k=2j 

< |x|-2(c|x|Td)^(-'"-2)_^^^2-2, 

(17) < 2\x\-\c\x\rdr^^''\ 
Since w/{\x\rd) < £d ~^ 0, we can further derive 

CSO 

Y^bjW^ < 2{w/xf{cw/{\x\Td)y-^ = 2{w/\x\)^cw/{\x\Td)/[l~cw/{\x\Td)] 

= o{\x\-^)w^. 
Combining (14), (15), (16), and (18) yields 

(18) i?,(x) = - / cxp[-(-l + ^)(l+o(l))^-(-^)(l + o(l))]du;. 

The $o(1)$'s appearing in the last expression are uniform as $w$ varies over the integral
interval. Now, the bounded convergence theorem ensures $R_d(x) = (1/\tau_d)(1 + o(1))$,
which establishes (13). Taking into account the remainder term, we conclude that

(19) K.dix) = -m cxp {Pd{x)) (1 + o(l)) + o{^^^^) . 

Td ^ UTd ' 
Our target $(\tau_d/\sqrt{2 \log n})\,E(T_{n,d})$ can now be written as

/oo 
-oo 

We decompose the preceding integral into three parts

(21) $\frac{\tau_d}{\sqrt{2 \log n}}\,E(T_{n,d}) = I_{n,d} + II_{n,d} + III_{n,d},$

where $I_{n,d}$, $II_{n,d}$, and $III_{n,d}$ represent the integral of (11) over the intervals
$[-\infty, -\epsilon_d \tau_d]$, $(-\epsilon_d \tau_d, -(\log n)^{1/4})$, and $[-(\log n)^{1/4}, \infty)$, respectively. The preceding
discussion, combined with the approximation $g_d(x) = |x| G_d(x)(1 + o(1))$ as $x \to -\infty$
and $|x| = o(\tau_d)$, implies that the dominating part is the quantity represented by
$II_{n,d}$. We have,

/ -1 \ (log n) 

IlnA = ^^7?=/ xGd{x)Gy\x)dGd{x){\+o{l)) 

r.nGd(-(log«)i/'') 

G^^{w/n)'w{l - w/nTdw{l + oil)) 



V2 log n 
1 

V21og71 JnGdi-edTd) 



1 



nGd(-(log«)i/'») 



V2l0gn JnGd(-edrd) 



v/-2 log(u;/7i)we~'"dw(l + o(l)) 



1 /■°° 

we~'"dw(l + o(l)) ^ / u)log«;e-"'du;(l +o(l)) 

V 2 log n Jo 

(22) = l+o(l). 

To arrive at (22) we have used the approximation $G_d^{-1}(z) = -\sqrt{-2 \log z}\,(1 + o(1))$ for
$z \to 0$ in light of Lemma A.1 and Mills' ratio.
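
The quality of this quantile approximation can be checked numerically. The sketch below substitutes the exact standard normal cdf for $G_d$ (an assumption; the paper's $G_d$ is only approximately normal) and compares the exact lower quantile with $-\sqrt{-2 \log z}$. The function name `quantile_ratio` is hypothetical.

```python
import math
from statistics import NormalDist


def quantile_ratio(z):
    """Ratio of the exact N(0,1) lower quantile to -sqrt(-2 log z)."""
    exact = NormalDist().inv_cdf(z)
    approx = -math.sqrt(-2.0 * math.log(z))
    return exact / approx


ratios = [quantile_ratio(z) for z in (1e-2, 1e-4, 1e-8, 1e-12)]
print(ratios)
```

The ratio increases toward one as $z$ decreases, but slowly, which is one reason the asymptotic statements should not be expected to be numerically tight at moderate sample sizes.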

For the remaining two parts, we use Mill's ratio and obtain 

In,d + IIIn.d < -=^(n-l)[P(5(i)<-edrrf)+P(5(i)>-(logn)i/4)] 
V21ogri 



Td 



V21og7i 



(n - 1) [1 - G^^i-SdTd) + G'2{- (logn)!/")] 



< -7^=i^ - l)[nGdi-SdTd) +G^^{- (logn)i/4)] 
V21ogri 

(23) 0. 

Finally, combining (21), (22), and (23), yields the desired result. □ 